246 research outputs found
Optimal control methods for simulating the perception of causality in young infants
There is a growing debate among developmental theorists concerning the perception of causality in young infants. Some theorists advocate a top-down view, e.g., that infants reason about causal events on the basis of intuitive physical principles. Others argue instead for a bottom-up view of infant causal knowledge, in which causal perception emerges from a simple set of associative learning rules. In order to test the limits of the bottom-up view, we propose an optimal control model (OCM) of infant causal perception. OCM is trained to find an optimal pattern of eye movements for maintaining sight of a target object. We first present a series of simulations which illustrate OCM's ability to anticipate the outcome of novel, occluded causal events, and then compare OCM's performance with that of 9-month-old infants. The impications for developmental theory and research are discusse
Learning Parameterized Skills
We introduce a method for constructing skills capable of solving tasks drawn
from a distribution of parameterized reinforcement learning problems. The
method draws example tasks from a distribution of interest and uses the
corresponding learned policies to estimate the topology of the
lower-dimensional piecewise-smooth manifold on which the skill policies lie.
This manifold models how policy parameters change as task parameters vary. The
method identifies the number of charts that compose the manifold and then
applies non-linear regression in each chart to construct a parameterized skill
by predicting policy parameters from task parameters. We evaluate our method on
an underactuated simulated robotic arm tasked with learning to accurately throw
darts at a parameterized target location.Comment: Appears in Proceedings of the 29th International Conference on
Machine Learning (ICML 2012
Adaptive Critics and the Basal Ganglia
One of the most active areas of research in artificial intelligence is the study of learning methods by which “embedded agents ” can improve performance while acting in complex dynamic environments. An agent, or decision maker, is embedded in an environment when it receives information from, and acts on, that environment in an ongoing closed-loop interaction. An embedded agent has to make decisions under time pressure and uncertainty and has to learn without the help of an ever-present knowledgeable teacher. Although the novelty of this emphasis may be inconspicuous to a biologist, animals being the prototypical embedded agents, this emphasis is a significant departure from the more traditional focus in artificial intelligence on reasoning within circumscribed domains removed from the flow of real-world events. One consequence of the embedded agent view is the increasing interest in the learning paradigm called reinforcement learning (RL). Unlike the more widely studied supervised learning systems, which learn from a set of examples of correct input/output behavior, RL systems adjust their behavior with the goal of maximizing the frequency and/or magnitude of the reinforcing events they encounter over time. While the core ideas of modern RL come from theories of animal classical and instrumenta
Using Relative Novelty to Identify Useful Temporal Abstractions in Reinforcement Learning
We present a new method for automatically creating useful temporal abstractions in reinforcement learning. We argue that states that allow the agent to transition to a different region of the state space are useful subgoals, and propose a method for identifying them using the concept of relative novelty. When such a state is identified, a temporallyextended activity (e.g., an option) is generated that takes the agent efficiently to this state. We illustrate the utility of the method in a number of tasks
Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density
This paper presents a method by which a reinforcement learning agent can automatically discover certain types of subgoals online. By creating useful new subgoals while learning, the agent is able to accelerate learning on the current task and to transfer its expertise to other, related tasks through the reuse of its ability to attain subgoals. The agent discovers subgoals based on commonalities across multiple paths to a solution. We cast the task of finding these commonalities as a multiple-instance learning problem and use the concept of diverse density to find solutions. We illustrate this approach using several gridworld tasks
Accelerating Reinforcement Learning through the Discovery of Useful Subgoals
An ability to adjust to changing environments and unforeseen circumstances is likely to be an important component of a successful autonomous space robot. This paper shows how to augment reinforcement learning algorithms with a method for automatically discovering certain types of subgoals online. By creating useful new subgoals while learning, the agent is able to accelerate learning on a current task and to transfer its expertise to related tasks through the reuse of its ability to attain subgoals. Subgoals are created based on commonalities across multiple paths to a solution. We cast the task of finding these commonalities as a multiple-instance learning problem and use the concept of diverse density to find solutions. We introduced this approach in [10] and here we present additional results for a simulated mobile robot task
Novelty or surprise?
Novelty and surprise play significant roles in animal behavior and in attempts to understand the neural mechanisms underlying it. They also play important roles in technology, where detecting observations that are novel or surprising is central to many applications, such as medical diagnosis, text processing, surveillance, and security. Theories of motivation, particularly of intrinsic motivation, place novelty and surprise among the primary factors that arouse interest, motivate exploratory or avoidance behavior, and drive learning. In many of these studies, novelty and surprise are not distinguished from one another: the words are used more-or-less interchangeably. However, while undeniably closely related, novelty and surprise are very different. The purpose of this article is first to highlight the differences between novelty and surprise and to discuss how they are related by presenting an extensive review of mathematical and computational proposals related to them, and then to explore the implications of this for understanding behavioral and neuroscience data. We argue that opportunities for improved understanding of behavior and its neural basis are likely being missed by failing to distinguish between novelty and surpris
- …